

Remarks

The present amendment replies to the Official Action mailed August 3, 2001. That action objected to the specification as informal. Claims 1-3, 9, 10, and 38-46 were rejected under 35 U.S.C. § 103(a) over Herrell et al. U.S. Patent No. 5,301,287. Claims 4-7 and 11-13 were rejected under 35 U.S.C. § 103(a) over Herrell in view of McLellan et al. U.S. Patent No. 5,890,201. Claims 8, 14, and 15 were objected to as being dependent upon a rejected base claim, but were indicated to be allowable if rewritten in independent form including all limitations of the base claim and any intervening claims. Each of the points raised by the Official Action is addressed below, following a brief discussion of the present invention to provide context.

Claims 1 and 42 have been amended to be more clear and distinct. Claims 16-37 were previously cancelled. Claims 1-15 and 38-46 are presently pending. Attached hereto is a marked-up version showing the changes made to the specification and claims by the current amendment. The attached pages are captioned "Version with Markings to Show Changes Made."

The Present Invention

The present invention relates generally to improvements in array processing, and more particularly to advantageous techniques for providing improved mechanisms of data distribution to, and collection from, multiple memories often associated with and local to processing elements within an array processor.

Various prior art techniques exist for the transfer of data between system memories or between system memories and I/O devices. Fig. 1 of the present application shows a conventional data processing system 100 comprising a host uniprocessor 110, processor local memory 120, direct memory access (DMA) controller 160, system memory 150 which is usually a larger memory store than the processor local memory, having longer access latency, and input/output (I/O) devices 130 and 140.

The DMA controller 160 provides a mechanism for transferring data between processor local memory and system memory or I/O devices concurrent with uniprocessor execution. DMA controllers are sometimes referred to as I/O processors or transfer processors in the literature. System performance is improved since the host uniprocessor can perform computations while the DMA controller is transferring new input data to the processor local memory and transferring result data to output devices or the system memory. A data transfer is typically specified with the following minimum set of parameters: source address, destination address, and number of data elements to transfer. Addresses are interpreted by the system hardware and uniquely specify I/O devices or memory locations from which data must be read or to which data must be written. Sometimes additional parameters are provided such as element size.

One of the limitations of conventional DMA controllers is that address generation capabilities for the data source and data destination are often constrained to be the same. For example, when only a source address, destination address and a transfer count are specified, the implied data access pattern is block-oriented, that is, a sequence of data words from contiguous addresses starting with the source address is copied to a sequence of contiguous addresses starting at the destination address.
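The block-oriented pattern described above can be illustrated with a minimal sketch. It is offered only as an aid to understanding and is not drawn from the present application or from any cited reference. The function name `block_transfer` and the modeling of memory as a flat Python list are assumptions made for illustration.

```python
def block_transfer(memory, src, dst, count):
    """Copy `count` words from contiguous addresses starting at `src`
    to contiguous addresses starting at `dst`.

    `memory` is a hypothetical flat address space modeled as a list.
    Only three parameters describe the transfer, so the source and the
    destination are both forced to use the same contiguous pattern.
    """
    for i in range(count):
        memory[dst + i] = memory[src + i]
    return memory


# Illustrative use: words 2..5 are copied to addresses 8..11.
mem = list(range(8)) + [0] * 8
block_transfer(mem, src=2, dst=8, count=4)
```

Because the access pattern is implied by the three parameters alone, a transfer that reads contiguously but writes with a stride, for example, cannot be expressed in this form.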

Array processing presents challenges for data collection and distribution in terms of addressing flexibility, control, and performance. The patterns in which data elements are distributed to and collected from processing element local memories can significantly affect the overall performance of the processing system. With the advent of the ManArray architecture, it has been recognized that it will be advantageous to have improved techniques for data transfer which provide these capabilities and which are tailored to this new architecture.

The present invention addresses a variety of advantageous methods and apparatus for improved data transfer control within a data processing system. In particular, improved techniques are provided for: distributing data to, and collecting data from, an array of processing elements (PEs) in a flexible and efficient manner; and PE address translation which allows data distribution and collection based on PE virtual IDs.

Further aspects of the present invention are related to a virtual-to-physical PE ID translation which works together with a ManArray PE interconnection topology to support a variety of communication models (such as hypercube and mesh) through data placement based upon a PE virtual ID. This result can be accomplished in a DMA controller by translation, through a VID-to-PID lookup table or through combinational logic, where the resulting PID becomes an addressing component on the DMA bus to PE local memories. This result can also be achieved at the PE local memories within the interface logic, where a VID available to the interface logic is compared to a VID presented on the DMA bus. A match at a particular memory interface allows that memory to accept the access.
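The two translation approaches summarized above can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation. The four-PE array, the table contents in `VID_TO_PID`, the class `PEMemoryInterface`, and the function names are all hypothetical.

```python
# Hypothetical VID-to-PID table for a four-PE array; index is the VID,
# value is the PID. The contents are illustrative only.
VID_TO_PID = [0, 1, 3, 2]


class PEMemoryInterface:
    """Models the interface logic in front of one PE local memory."""

    def __init__(self, pid, vid):
        self.pid = pid
        self.vid = vid      # VID available to this interface logic
        self.memory = {}    # local memory modeled as address -> value

    def on_bus_write(self, bus_vid, address, value):
        # Second approach: accept the access only on a VID match.
        if bus_vid != self.vid:
            return False
        self.memory[address] = value
        return True


def build_array():
    # Give each physical PE the VID that the table maps to its PID.
    return [PEMemoryInterface(pid, VID_TO_PID.index(pid))
            for pid in range(len(VID_TO_PID))]


def dma_write_via_table(pes, vid, address, value):
    # First approach: the controller translates the VID through the
    # table, and the PID becomes the addressing component on the bus.
    pid = VID_TO_PID[vid]
    pes[pid].memory[address] = value
    return pid


def dma_write_via_match(pes, vid, address, value):
    # The VID is presented to every interface; matching ones accept.
    return [pe.pid for pe in pes if pe.on_bus_write(vid, address, value)]
```

In either case the program addresses PEs by VID, so the same data placement code can serve different communication models by changing only the VID assignment.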

#### Objection to Specification

The specification has been amended to address the informality objection.

#### 35 U.S.C. 103 Rejections

The claims as presently amended make it clear that the present invention addresses techniques for improved array processing in which mechanisms are provided for effective data distribution to and collection from multiple memories associated with and local to processing elements within the array processor.

By contrast, Herrell relates to aspects of an entirely different problem. Herrell states that it "relates to a method and apparatus for providing direct access by an external data processing system to data stored in the main memory of a host system, and more particularly, to an interface method and apparatus for providing direct memory access by an external data processing system, such as a graphics subsystem, to virtual memory of the host system by transferring the contents of main memory of the host system at a location in virtual memory space specified by the user to the external data processing system under the user's control." Col. 1, lines 10-20. Reconsideration and withdrawal of the present rejection are respectfully requested.

Conclusion

All of the claims standing in order for allowance, this case should be promptly allowed. Should there be any issues which might be expedited by a telephone call, the Examiner is requested to call the undersigned at the number below.

Respectfully submitted,



Peter H. Priest  
Reg. No. 30,210  
Priest & Goldstein, PLLC  
529 Dogwood Drive  
Chapel Hill, NC 27516  
(919) 942-1434

VERSION WITH MARKINGS TO SHOW CHANGES MADE

#### In the Specification

Please change the first two sentences of the present application as follows:

[This] The present application is a division of [application] U.S. Application Serial Number 09/472,372 filed on December 23, 1999, now U.S. Patent No. 6,256,683, which in turn claimed [. The present application claims] the benefit of U.S. Provisional Application Serial No. 60/113,637 entitled "Methods and Apparatus for Providing Direct Memory Access (DMA) Engine" and filed December 23, 1998 which is incorporated by reference in its entirety herein.

Please replace the paragraph beginning at page 6, line 1 and extending to page 7, line 19 as follows:

Further details of a presently preferred ManArray core, architecture, and instructions for use in conjunction with the present invention are found in U.S. Patent Application Serial No. 08/885,310 filed June 30, 1997, now U.S. Patent No. 6,023,753, U.S. Patent Application Serial No. 08/949,122 filed October 10, 1997, now U.S. Patent No. 6,167,502, U.S. Patent Application Serial No. 09/169,255 filed October 9, 1998, U.S. Patent Application Serial No. 09/169,256 filed October 9, 1998, now U.S. Patent No. 6,167,501, U.S. Patent Application Serial No. 09/169,072 filed October 9, 1998, now U.S. Patent No. 6,219,776, U.S. Patent Application Serial No. 09/187,539 filed November 6, 1998, now U.S. Patent No. 6,151,668, U.S. Patent Application Serial No. 09/205,558 filed December 4, 1998, now U.S. Patent No. 6,173,389, U.S. Patent Application Serial No. 09/215,081 filed December 18, 1998, now U.S. Patent No. 6,101,592, U.S. Patent Application Serial No. 09/228,374 filed January 12, 1999, now U.S. Patent No. 6,216,223 [and entitled "Methods and Apparatus to Dynamically Reconfigure the Instruction Pipeline of an Indirect Very Long Instruction Word Scalable Processor"], U.S. Patent Application Serial No. 09/238,446 filed January 28, 1999, U.S. Patent Application Serial No. 09/267,570 filed March 12, 1999, U.S. Patent Application Serial No. 09/337,839 filed June 22, 1999, U.S. Patent Application Serial No. 09/350,191 filed July 9, 1999, U.S. Patent Application Serial No. 09/422,015 filed October 21, 1999 [entitled "Methods and Apparatus for Abbreviated Instruction and Configurable Processor Architecture"], U.S. Patent Application Serial No. 09/432,705 filed November 2, 1999 [entitled "Methods and Apparatus for Improved Motion Estimation for Video Encoding"], U.S. Patent Application Serial No. [ ] 09/471,217 filed December 23, 1999, now U.S. Patent No. 6,260,082 [entitled "Methods and Apparatus for Providing Data Transfer Control"], as well as, [Provisional Application Serial No. 60/113,637 entitled "Methods and Apparatus for Providing Direct Memory Access (DMA) Engine" filed December 23, 1998, Provisional Application Serial No. 60/113,555 entitled "Methods and Apparatus Providing Transfer Control" filed December 23, 1998,] Provisional Application Serial No. 60/139,946 entitled "Methods and Apparatus for Data Dependent Address Operations and Efficient Variable Length Code Decoding in a VLIW Processor" filed June 18, 1999, Provisional Application Serial No. 60/140,245 entitled "Methods and Apparatus for Generalized Event Detection and Action Specification in a Processor" filed June 21, 1999, Provisional Application Serial No. 60/140,163 entitled "Methods and Apparatus for Improved Efficiency in Pipeline Simulation and Emulation" filed June 21, 1999, Provisional Application Serial No. 60/140,162 entitled "Methods and Apparatus for Initiating and Re-Synchronizing Multi-Cycle SIMD Instructions" filed June 21, 1999, Provisional Application Serial No. 60/140,244 entitled "Methods and Apparatus for Providing One-By-One Manifold Array (1x1 ManArray) Program Context Control" filed June 21, 1999, Provisional Application Serial No. 60/140,325 entitled "Methods and Apparatus for Establishing Port Priority Function in a VLIW Processor" filed June 21, 1999, Provisional Application Serial No. 60/140,425 entitled "Methods and Apparatus for Parallel Processing Utilizing a Manifold Array (ManArray) Architecture and Instruction Syntax" filed June 22, 1999, Provisional Application Serial No. 60/165,337 entitled "Efficient Cosine Transform Implementations on the ManArray Architecture" filed November 12, 1999, and Provisional Application Serial No. [ ] 60/171,911 entitled "Methods and Apparatus for [DMA] Loading of Very Long Instruction Word Memory" filed December 23, 1999, respectively, all of which are assigned to the assignee of the present invention and incorporated by reference herein in their entirety.

Please replace the paragraph at page 12, lines 1-12 as follows:

Each transfer controller within a ManArray DMA controller is designed to fetch its own stream of DMA instructions. DMA instructions are of five basic types: transfer; branch; load; synchronization; and state control. The branch, load, synchronization, and state control types of instructions are collectively referred to as "control instructions", and distinguished from the transfer instructions which actually perform data transfers. DMA instructions are typically of multi-word length and require a variable number of cycles to execute although several control instructions require only a single word to specify. Although the presently preferred embodiment supports multiple DMA instruction types as described in further detail in U.S. Patent Application Serial No. [ ] [entitled "Methods and Apparatus for Providing Data Transfer Control"] 09/471,217 filed December 23, 1999, now U.S. Patent No. 6,260,082, and incorporated by reference in its entirety herein, the present invention focuses on instructions and mechanisms which provide for flexible and efficient data transfers to and from multiple memories.

Please replace the paragraph at page 20, lines 13-21 as follows:

The following aspects of the loop formulation are noted. When the requested number of accesses are made (TC in Figs. 10-12) then all loops are exited immediately, leaving all address and loop control variables in their current states. By using logical "while" loops and reinitializing a loop only at its exit, it is possible to reenter the loops and continue a transfer after "terminal count" (TC) addresses have been accessed. This capability is used in this invention to allow transfers to be restarted so that the addressing continues as though it would if the transfer count had not been exhausted. For further details of such transfers see U.S. Application Serial No. [ ] 09/471,217 filed December 23, 1999 [entitled "Methods and Apparatus for Providing Data Transfer Control"], now U.S. Patent No. 6,260,082, which is incorporated by reference in its entirety herein.

#### In the Claims

1. (Amended) An apparatus for performing virtual identification (VID) to physical identification (PID) translation for data elements to be accessed within local memory of a processing element (PE) whereby a direct memory access (DMA) controller can access PE local memories according to their VIDs, the apparatus comprising:

an array of multiple PEs each having local PE memory;

a DMA controller; and

a memory maintained in the DMA controller for storing a processing element VID-to-PID table mapping processing element VIDs to processing element PIDs utilized by the DMA controller to access local memories according to their VIDs.

42. (Amended) A processing apparatus comprising:

a plurality of processing elements (PEs) communicatively connected by a bus, each PE comprising a register storing a virtual identification number (VID) identifying the PE; and

a direct memory access (DMA) controller connected to the bus for accessing local data memory of the PEs, each data access at least partially identified by a VID;

wherein during a common data access to multiple PEs, a PE responds to the data access if the VID stored in the register matches the VID of the data access.